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Foreword 


The  mathematical  basis  of  the  Navy’s  multivariate  optimum 
interpolation  (MVOI)  analysis  system  is  presented  in  detail.  The 
analysis  is  used  operationally  to  analyze  data  for  the  Operational 
Global  Atmospheric  Prediction  System  (NOGAPS  3),  the 
Operational  Regional  Atmospheric  Prediction  System  (NORAPS  5), 
and  in  the  Navy's  forecaster  work  station,  the  Tactical  Environmental 
Support  System  (TESS  (3)).  The  purpose  of  this  document  is  to 
help  Navy  personnel  to  use  the  MVOI,  and  to  serve  as  documentation 
of  the  analysis  system. 
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Executive  Summary 


The  equations  for  the  Navy’s  multivariate  optimum  interpolation 
(MVOI)  analysis  system  for  atmospheric  analysis  are  presented  in 
complete  form.  Included  are  the  derivations  for  horizontal  and  vertical 
covariances  that  retain  the  geostrophic  constraint  and  blend  various 
kinds  of  observations  such  as  pressure  thickness,  pressure  height,  and 
winds.  Methods  that  scale  the  observation  values  to  the  resulting 
nondimensional  values  so  that  they  are  consistent  with  the  constraints 
are  presented. 

The  derivations  of  the  analysis  equations  are  provided  using  optimal 
theory  to  minimize  the  analysis  error  variance.  The  derivation  of  the 
internal  data  checking  procedure  is  given.  The  method  compares 
suspicious  values  with  neighboring  values  and  removes  rejected  values 
in  a  computationally  consistent  manner. 

Validation  experiments  that  show  the  exactness  of  the  constraints, 
expected  analysis  error,  and  the  method  for  analysis  of  wind  around  the 
poles  are  presented.  These  experiments  have  known  results,  and  therefore 
are  useful  in  catching  design  and  program  errors. 

A  brief  description  is  provided  of  the  observation  selection  procedures 
•  and  internal  quality  control,  and  some  recommendations  for  improvement 

are  given. 
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The  Fleet  Numerical  Oceanography  Center  (FNOC)  supports  Navy 
Fleet  operations.  To  be  effective,  the  center  has  had  to  evolve  into  a 
numerical  prediction  center  capable  of  support  throughout  the  world.  A 
key  component  of  this  support  is  the  Navy  Operational  Global  Atmospheric 
Prediction  System  (NOGAPS)  described  by  Rosmond  (1992)  and  Hogan 
and  Rosmond  (1991).  Besides  providing  an  analysis  and  prediction  of 
weather  patterns,  it  drives  other  predictions  such  as  the  Thermal  Ocean 
Prediction  System  (TOPS),  the  Global  Spectral  Ocean  Wave  Model 
(GSOWM),  and  the  WAve  Model  (WAM)  with  momentum  and  heat 
fluxes. 

With  continual  increases  in  computer  power  and  a  wider  diversity  of 
observation  platforms,  a  sophisticated  analysis  system  was  required  to 
effectively  assimilate  data  into  NOGAPS.  As  a  result,  the  multivariate 
optimum  interpolation  (MVOI)  analysis  was  developed  by  the  Naval 
Oceanographic  and  Atmospheric  Research  Laboratory  (NOARL).  This 
analysis  method  is  similar  to  those  developed  at  other  centers  such 
as  the  National  Center  for  Atmospheric  Research  (NCAR)  (Schlatter, 
1975),  the  Canadian  Atmospheric  Environment  Service  (Rutherford, 
1976,  1978),  the  National  Meteorological  Center  (Bergman,  1976, 1979), 
and  the  European  Center  for  Medium-Range  Weather  Forecasts  (ECMWF) 
(Lorenc,  1981).  Although  optimum  procedures  have  been  known  for 
over  30  years  (Gandin,  1963),  their  potential  was  not  fully  realized 
until  the  1980’s  when  sufficiently  powerful  computers  allowed  the 
optimum  interpolation  software  design  to  utilize  hundreds  of  data  values 
for  each  grid  point  rather  than  tens  of  values,  more  sophistication  in  the 
quality  control  procedures,  and  increased  accuracy  in  determining  the  data 
and  assimilation  system  error  characteristics  (see  Barker  and  Rosmond, 
1985).  ECMWF  exploited  these  options  on  their  supercomputers,  and 
many  of  their  findings  were  included  in  the  design  and  development  of 
the  Navy’s  system. 

The  Navy’s  MVOI  became  operational  early  in  1988  (Barker,  et  al., 
1988)  with  the  implementation  of  NOGAPS  3.0  The  new  system  included 
upgrades  in  all  components:  the  finite  difference  model  was  replaced  by 
a  spectral  model,  the  calculus  of  variations  initialization  was  replaced 
with  the  nonlinear  normal  mode  method,  and  the  successive  corrections 
analysis  was  replaced  with  the  MVOI. 

This  report  provides  the  formulations  and  reasoning  used  in  the 
development  of  the  MVOI,  and  some  of  the  validation  experiments  that 
also  show  functionality  of  the  design  options. 

A  major  portion  of  the  development  of  this  system  went  into  preparing 
the  raw  data  for  analysis.  This  involved  pulling  the  raw  data  from  the 
packed  format  in  die  FNOC  data  base,  reformatting  it,  and  then  transferring 
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2.0  Horizontal 
Covariances 


it  into  a  data  base  on  the  supercomputer.  Data  quality  control 
procedures  are  applied  on  the  CYBER  205  data  base  as  described 
in  Baker  (1991a-d).  Rather  than  develop  new  tables  for  a  new 
format,  the  one  developed  by  the  World  Meteorological  Organization 
for  the  Global  Atmospheric  Research  Program  (GARP)  and  the  First 
GARP  Global  Experiment  (FGGE)  was  adapted.  This  format  had  the 
added  advantage  of  easier  access  to  the  FGGE  data. 

Although  the  theory  underlying  the  MVOl  can  be  found  in  numerous 
papers,  it  is  given  in  compete  form  in  sections  2  through  5.  Much  of  the 
basic  work  was  patterned  after  Lorenc  et  al.  (1977)  and  Lorenc  (1981). 
The  additional  work  of  Daley  (1985)  to  include  parameters  for  geostrophic 
coupling  and  for  relative  magnitudes  of  the  divergence  and  rotational 
components  of  wind  are  also  included. 

Covariance  computations  on  polar  coordinates  and  transformation  to 
a  Cartesian  grid  greatly  simplify  computations;  these  also  are  presented 
in  complete  form. 

Code  validation  is  an  extremely  important  part  of  any  system 
development.  From  the  extensive  literature  describing  MVOI  methods, 
ways  of  testing  these  characteristics  can  be  found.  The  methods  most 
valuable  in  locating  design  and  coding  errors  are  presented  in  section  6. 
These  methods  made  the  implementation  of  the  analysis  methods  relatively 
error  free,  although  it  is  believed  that  totally  error-free  code  may  not  be 
possible  (Pamas,  1985;  Meyers,  1986). 

The  methods  used  to  select  the  data  can  make  a  difference  in  the 
smoothness  and  accuracy  of  the  analysis  (Barker,  1991).  The  data  selection 
procedure  picks  observations  for  volumes  of  grid  points,  and  the  size  of 
the  volumes  is  determined  by  observation  density.  The  strategy  used  for 
this  data  selection  procedure  is  given  in  section  7. 

Vector  computers,  such  as  the  CYBER  205,  are  most  efficient  when 
the  software  is  designed  so  that  computations  are  done  on  strings  of 
contiguous  variables.  The  computer  is  then  able  to  compute  values  in 
an  assembly  line  fashion,  resulting  in  ten-fold  increases  in  computation 
rates.  The  contiguous  strings,  or  vectors,  require  special  planning  starting 
with  the  theoretical  foundation.  For  example,  use  of  the  volume  method 
of  Lorenc  (1981),  where  data  values  are  selected  and  the  analysis  is 
made  for  volumes  of  grid  points  vice  individual  grid  points,  make  it 
possible  to  compose  computational  strings  of  over  300  data  values.  The 
computational  vector  length,  therefore,  is  typically  in  excess  of  300 
operations  long. 

Even  though  MVOI  theory  was  introduced  into  the  meteorological 
literature  over  30  years  ago,  the  MVOI  techniques  have  undergone 
dramatic  improvement  over  the  last  10  years.  Much  of  this  improvement 
is  due  to  the  increased  power  of  computers.  However,  to  achieve  the 
added  performance,  the  design  of  MVOI  requires  special  consideration 
for  setting  up  the  computation  for  vector  and  multiprocessor  computers. 

Finally,  some  recommendations  are  presented  in  section  8  that  serve 
to  point  out  some  of  the  weaknesses  discovered  in  the  system,  and  to 
make  some  recommendations  for  further  research. 

The  underlying  theory  of  optimum  interpolation  requires  and  accurate 
description  of  prediction  and  observation  errors  to  produce  a  truly  optimal 
solution.  Unfortunately,  optimal  theory  becomes  too  complicated  except 
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when  the  error  structure  of  the  predication  model  is  assumed  stationary 
and  homogeneous.  This  has  lead  some  investigators  to  class  MVOl  as 
commonly  used  as  a  statistical  method  rather  than  an  optimal  one 
(McPherson  et  al.,  1979)  since  the  true  optimal  solution  would  be  lost 
with  such  simplifying  assumptions.  Nevertheless,  it  is  hoped  that  the 
models  used  to  represent  the  prediction  errors  produce  all  but  the  minor 
improvements  that  would  be  gained  from  precise  statistical  models. 

When  a  simple  covariance  model  is  assumed,  the  short-term  prediction 
error  correlations  become: 

(vw}j  =  (^JF{rij)  (1) 

(x«X/)  =  (x  )g  ( rij)>  (2) 

and 

Uif\  =  (^jE(rij),  (3) 

where  y  is  streamfunction,  x  is  velocity  potential,  and  <J>  is  geopotential; 
r,y  is  the  distance  between  two  arbitrary  points  i  and  j;  F,  G,  and  E  are 
the  covariance  models  for  y,  x.  and  <j>,  respectively;  the  angle  brackets 
symbolize  an  ensemble  average  operator,  and  (y2>,  (x2),  and  (<j)2)  are 
the  prediction  error  variances. 

The  covariance  of  the  predicted  estimate  errors  between  y  and  x  is 
defined  as 


(v.X,)  =  (xi  ¥y)  =  V(y2)(x2)tf  (nj) . 


(4) 


where  H  is  the  covariance  model. 

Using  the  Cartesian  definition  for  distance 

'’v  =  V(z1-zy)2  +  (y,-yy)2,  (5) 


the  partial  derivatives  in  Cartesian  space  operating  on  an  arbitrary  function 
A  (riy)  becomes 


ai 

dxi 


A'*L=a'  &-*) 

dxt  V(*,-X;)2+(y,-y>)2 


(6) 


dA  Xi  -Xi 
—  =  — - A 

dxj  r 


(7) 
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In  these  equations  and  those  that  follow,  the  subscripts  on  r  are  dropped 
because  they  are  not  necessary  for  clarity. 

Second,  partial  derivatives  in  Cartesian  coordinate  space  operating 
on  A  have  the  following  form: 


d  a  _  d  ixj-xj  A,\_  [yi-yj){xi-xj) 

dyidxj  By,  '  r  I  r2 

d2A  _  _  (xi-Xj){yj-yj)  A„ _A^_ 
dxidyj  r2  'T 

-  (**  \a"~  — ]  ,  and 

Bx;Bx;  r  r2  n 


A"  -  — 
r 


BxfBxj  r  r1 
d2A  _  A/_  _  jyi-yj) 

dy.dy,  r  r2 


A"  -  — 
r 


With  the  relations  above,  the  covariances  defined  by  equations  (1) 
and  (2)  can  be  converted  to  measurable  quantities,  u,  the  wind  along  the 
x-axis,  and  v,  the  wind  along  the  y-axis.  Substituting  the  relationships 
between  streamfunction,  velocity  potential,  and  wind, 

+  (13) 

By  dx 


+  (14) 

Bx  By 

The  equation  for  spatial  covariances  for  the  u-component  of  wind 
becomes: 


\\  dy,  *Xi}\  dy>  dx; 


dy,  dyA  /ax,  dyA_/dy,-  dxA^/dx,-  dxA 

dy,  ByJ  \Bxi  By}  \dy,  BxJ  W  BxJ 
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It  is  reasonable  and  much  simpler  to  assume  that  the  rotational 
component  of  the  atmospheric  motion  is  uncorrelated  with  the 
divergent  component  so  that  (xiVj)  =  {x.j  V<)  =  0.  and 


M  =  T^—  M  +  im) 

dyt  dyj  oX[  oXj 

06) 

where  is  a  derivative  operator  with  respect  to  r  defined  as 


<R(A)=A"-  —  .  (17) 

r 

The  v,  -vj  covariances  are  similarly  derived  so  that 

08) 

Finally,  for  the  u(-  -v;  covariances 

r2 

-(^-^lf‘-^>W(x,x))].  (19) 

r2 

The  covariances  (16),  (18),  and  (19)  are  simplified  by  transforming 
winds  to  cylindrical  coordinates,  which  is  a  coordinate  system  in  the 
radial  direction,  r,  and  the  angle  this  direction  makes  with  the  x-axis, 
0,  i.e., 

v '  =  cv  (20) 

where 
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V'  = 


(22) 


and 


c  = 


cos0  sin0 
-  sin  0  cos  0 


(23) 


The  values  u'  and  v'  are  the  normal  and  tangential  velocity  components 
respectively. 

Since 


c-1  =  Adjoint  c  , 

the  reverse  of  equation  (20)  is  simply 
v  =  Adjoint  cv'. 

Using 

Xj-X  i 

-t - -  =  cos  0 

r 

and 

=  sin0  , 
r 


(24) 


(25) 


(26) 


(27) 


to  solve  for  the  velocity  covariances  in  cylindrical  coordinates  yields: 

ifiju 'jj  =  ((«,■  cos  0  +  v,-  sin ©)  (uj cos  0  +  v;  sin  0) ) 

=  (uiUj) cos2  0+2  («,vy)  sin0  cos  0  +(v,v/)sin2  0  , 
or 

'  r  dr  drz 

Similarly, 


(28) 


(29) 


(,.;r-\-  i  iiml . 

'  1  dr 2  r  dr 


(30) 


and 


k  *;)-(*/« ;)-o. 


(31) 
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These  covariances  are  now  easily  computed  once  the  relations  (1)  and 
(2)  are  defined. 

To  obtain  the  covariances  between  geopotential,  $,  and  wind,  the 
covariances  are  first  defined  in  Cartesian  coordinates  where 


W=(w(-?+ff 

\  dXjh 

) 

i 

1 

II 

,  xj~x‘  dM 

(32) 

r  dr 

r  dr  ’ 

M  =  -  M  - 

\  dxj  fyl 

) 

(33) 

_  xrxi  d(bWj)  yj-yi  d^Xj) 

i  * 

(34) 

r  dr  r  dr 


and 

(vi  fy)  =  -  (fcvy) .  (35) 


Covariances  for  a  geostrophically  coupled  system  can  be  modeled 
using 

M = M = j  (m)  Vzfj"  ’  (36) 

and 

(*.X))  =  M  =  0,  (37) 

where  /  is  the  Coriolis  parameter  and  p  is  a  function  of  the  degree  of 
geostrophic  coupling  desired  in  the  final  analysis.  As  p  ranges  from  0 
to  1,  the  modeled  covariance  between  wind  and  geopotential  ranges 
from  0,  uncoupled,  to  1,  full  geostrophic  coupling.  Fractional  values  of 
p  result  in  a  partially  geostrophic  system  as  discussed  in  Daley  (1985). 
Substituting  equation  (36)  into  equation  (32)  and  equation  (34)  gives 

and 

(39) 

r  f  dr  V  E(r) 
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Assuming  the  autocovariance  function  for  y,  F(r),  is  the  same  as  that 
for  <j>,  E  (r),  and  using  equations  (26),  (27),  (32),  and  (34)  to  derive  the 
covariances  in  cylindrical  coordinates  yields: 

(fiiuj)  =  (<j>,«j}cos  6  +  (fty,)  sin  0 
=  0  , 
and 

(<t>,  v')=  (<j>;  v;)sin  0  +  (<t>,  vy)cos  0 

d(h  4>j  lF(r) 
f  dr  V  E(r) 

It  is  possible  to  model  the  amount  of  divergence  or  rotation  in  the 
predicted  estimate  wind  error  through  an  appropriate  choice  of  covariance 
model  for  x  and  \|/.  Since  E  ( r is  readily  determined  from  observations, 
it  is  convenient  to  define 

F(r)  =  (l-v)£(r),  (42) 

and 

G  (r)  =  v£(r) .  (43) 

The  variable  v  then  becomes  the  parameter  to  model  the  divergence 
that  exists  in  the  wind  field.  As  v  approaches  1 ,  the  wind  error  models 
become  fully  divergent,  and  as  n  approaches  0,  the  wind  models  become 
rotational. 

The  covariance  models  (3),  (29),  (30),  and  (41)  have  now  been 
formulated  so  that: 

•  There  are  two  parameters  to  control  the  constraints  in  the  analysis: 
|A,  which  controls  the  geostrophic  coupling,  and  v,  which  controls  the 
divergence. 

•  They  are  in  cylindrical  coordinates,  which  is  simpler  than  in  Cartesian 
coordinates,  and  {w;v}),  (v •  «'•),  (v/  and  (<{ >;vj)  do  not  have  to  be 
computed  since  they  are  zero. 

•  Their  computation  is  independent  of  the  projection  of  the  analysis 
grid. 

•  They  only  depend  on  the  function  chosen  for  the  normalized 
covariance  model  for  geopotential,  E 

The  treatment  of  error  functions  in  this  section  assumed  that  the 
vertical  structure  was  independent  of  the  horizontal  structure.  This  makes 
the  computations  much  faster  than  using  a  fully  three-dimensional 
structure  function,  and  the  mathematics  are  also  easier.  The 
vertical  structure  formulations  are  presented  in  the  next  section. 


(40) 


(41) 
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The  definition  of  the  vertical  geopotential  covariance  is  defined  as 


(<M H(Pi’Pk)  ’  (44) 

where  the  subscripts  refer  to  two  arbitrary  pressure  levels  p,  and  pk,  and 
H  is  the  normalized  vertical  covariance  function  for  geopotential  and  a 
function  of  the  pressure  levels. 

The  geostrophic  coupling  relations  are  complicated  for  vertical 
covariances  unless  the  normalized  vertical  covariance  is  the  same  for 
both  wind  and  geopotential.  Consequently,  the  wind  vertical  covariances 
fake  the  form  of  equation  (44)  except  <p  is  replaced  by  u  or  v. 

Satellites  do  not  provide  a  reference  such  as  surface  pressure  to  their 
soundings,  so  the  temperature  they  measure  can  only  be  converted  to 
geopotential  thickness,  unless  a  reference  level  is  separately  analyzed 
beforehand.  Lorenc  (1981)  established  the  covariances  for  thickness  so  that 
the  satellite  temperatures  could  be  used  directly.  This  makes  the  use  of 
thickness  information  more  mathematically  precise,  as  the  optimality  does 
not  have  to  account  for  the  errors  in  a  separately  derived  reference 
level. 

The  vertical  thickness  covariance  is 


(D,Dk)  =  { (<>i  — — l)(<l>A: — -l )) 


=  *)  ~(<t>.-i  4 >*)  -  (4><4>*-i)  +  (4>;-i4>*-i)  .  (45) 

where  the  individual  terms  are  evaluated  from  equation  (44). 
Thickness  correlated  with  any  other  variable  such  as  Ak  is 

=  (46) 


4.0  Variable  Scaling 


The  optimum  interpolation  equations  are  typically  simplified 
by  normalizing  the  variables  by  the  prediction  error  variance. 
These  normalization  constants  can  be  tricky,  and  if  done  incorrectly, 
they  destroy  the  geostrophic  and  rotational  flow  constraints,  as  well  as 
the  statistical  properties  of  the  optimization. 

The  velocity  covariance  relations,  equations  (29)  and  (30),  evaluated 
in  the  limit  as  point  separation  goes  to  zero,  form  the  basis  for  the 
scaling  values,  i.e.. 


1  <*(w>)  d2{xiX ) 

r  dr  dr2 


and 


lim  d2(\fj\fj) 

r — >0  dr2 


I  d(XiXj) 

r  dr 


(47) 


(48) 
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The  prediction  error  estimate  of  the  wind  is  computed  from  the  background 
or  first-guess  wind  error  variance  because  its  dependence  on  latitude  is 
much  smaller  than  the  geopotential  first-guess  error  variance.  Then  the 
geopotential  first-guess  error  variance  is  estimated  from  the  error 
covariance  relationships.  To  make  the  scaling  values  compatible  with 
geostrophic  coupling,  the  geostrophic  coupling  parameter,  p,  is  set  to  1  for 
full  coupling  and  the  divergence  parameter,  v,  is  set  to  0  for  nondivergent 
flow.  The  streamfunction  prediction  error  variance  is  estimated  from 

the  geostrophic  approximation,  y,  where  <§p  is  the  error  variance  of  the 

predicted  estimate  of  geopotential,  and  as  such  becomes  the  square  of 
the  scaling  factor  for  geopotential.  The  relationship  for  the  scaling 
constants  is 


(49) 


(50) 


where  the  subscript  p  designates  the  prediction  error  variance. 
The  relationship 


lira  ^E(r)  =  lim  1  iB[l , 
r—>0  dr2  r_»0  r 


(51) 


and  the  condition  that  the  normalized  model  for  geopotential,  E,  be 
twice  differentiable  in  the  limit  as  r  goes  to  zero  (see  Franke,  et  al„ 
1988)  makes  possible  the  definition 


_  1_  dE(r)\ 
r  dr  I  ' 


(52) 


The  wind  and  geopotential  scaling  values  are  therefore  related  by 


(53) 


This  relation  must  be  satisfied  if  the  geostrophic  and  rotational  flow 
constraints  are  to  be  imposed. 

Wind  variance  is  nearly  the  same  in  the  tropics  as  it  is  in  the  mid¬ 
latitudes,  whereas  geopotential  variance  decreases  from  midlatitudes  to 
the  tropics.  As  a  result,  the  scaling  variances  are  determined  by  verifying 
short-term  predictions  of  wind  against  observations,  and  then  computing 
geopotential  scaling  constants  from  equation  (53). 
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Thickness  scaling  is  computed  from  equation  (45),  where 


5.0  Optimizing  the 
Analysis 


[Dl  (?*)]2=  ^P{Pk)H(Pk,Pk) 

-20P  (Pk  )%{Pk -l) H  (Pk.Pk- 1) 

+  tfP(Pk-i)H  (P*-i.Pi-i).  (54) 

Meteorological  analysis  methods  typically  involve  the  computation 
of  a  weighted  average  of  observation  values.  Written  in  general  vector 
matrix  form,  the  analysis  at  a  specific  point,  k,  is 


T 

r*  =  w* 


b\~P\ 
bi  -  Pi 


(55) 


where  bt  -p,  is  the  difference  between  the  observation  value  and  the 
prediction  value,  rk  is  the  analyzed  increment,  and  w  is  the  column 
matrix  containing  die  weighting  coefficients.  Optimization  of  this  equation 
can  be  done  in  many  ways  and  with  different  levels  of  success,  depending 
on  the  assumptions  necessary  to  keep  the  solution  tractable.  The  optimum 
interpolation  analysis  method  that  evolved  through  the  works  of  Gandin 
(1963),  Rutherford  (1976,  1978),  Schlatter  (1975),  Bergman  (1976,  1979), 
Lorenc  (1981),  Daley  (1985),  and  others  is  the  basis  for  the  optimization 
method  adapted  for  this  analysis.  This  analysis  method  requires  that  the 
mean-squared  error  of  the  analysis  be  minimized  over  statistically 
significant  realizations.  The  limitations  of  this  optimization  are  caused 
by  the  assumption  of  stationarity  and  homogeneity  in  the  computation 
of  the  variable  covariances  and  in  the  inexact  determination  of  the 
statistical  parameters  used. 

This  description  closely  follows  the  one  presented  by  Lorenc  (1981) 
and  Daley  (1985,  1991),  except  that  more  steps  are  provided  in  the 
derivations  to  make  them  easier  to  follow. 

Minimization  of  analysis  error  requires  the  definition  of  the  following 
error  parameters: 

•  Analysis  error  is  a  =  A  -  T,  where  A  is  the  analyzed  value  and  T 
is  the  true  value. 

•  Observation  error  is  b  =  B  -  T,  where  B  is  the  observed  value. 

•  Prediction  error  is  p  =  P  -  T,  where  P  is  the  predicted  value. 

•  Analysis  error  estimate  is  Ea  =  (a2)m. 

•  Observation  error  estimate  is  E  °  -  (b  2)in. 

•  Prediction  error  estimate  is  Ep  =  {p2)m. 

•  The  error  parameters  scaled  by  the  prediction  error  estimate  are: 


(56) 
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a  _  Ea 

G  , 

(57) 

Ep 

B-P 

Q  = -  . 

(58) 

EP 

A -P 
r  = - , 

(59) 

EP 

P 

it  =  —  , 

(60) 

EP 

(61) 

Ea 

B  =  —  ,  and 

(62) 

E° 

(63) 

£fl 

The  analysis  equation  scaled  by  the  values  of  expected  error  is 

b\  E  0  pi 
E0  Ep  Ep 


a_ 

EP 


(64) 


which  leads  to  the  equation  for  analysis  error 


o*  ej  =  jc*+  wjT 


M  -*t 


L  P.  e  °  -  w/  j 


(65) 


Squaring  the  analysis  error  equation  (65);  taking  the  ensemble  average; 
then  using  the  following  relations: 


I*;)2 


(66) 


(67) 
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gives  the  squared  analysis  error  equation, 


» 


(g*)  =  l-2w[h*  +  w[Mw*  , 


where 


(68) 


I 


I 


» 


I 


I 


I 


» 


M  = 


m  n  mXn 

...  mtJ  ... 
mn  t  •••  mnn 


mij  =  (wjl  -  e °i  (p;pj  e j  -  G  °i  (pi it]  -  (iCi  P]gJ  , 


(69) 


(70) 


and 


h  = 


(71**1)  "  (*tPi)et 

L  (»**]  ~  fep«)e® 


(71) 


The  optimality  in  the  analysis  occurs  when  the  differentiation  of  (68) 
with  respect  to  the  weighting  coefficients,  w*,  is  set  to  zero,  which 
leads  to 

w*  =  M-1h*.  (72) 

The  analysis  equation,  (55),  in  optimal  form  is  therefore 

r*  =  h[M-iq  .  (73) 

The  last  two  terms  of  this  equation  are  independent  of  the  analysis 
grid  location,  k,  so  they  can  be  combined  into  a  single  term 

c  =  M_1q.  (74) 


This  simplifies  the  analysis  computation  to 

r*  =  crh*  .  (75) 

The  term  (74)  only  needs  computing  each  time  the  observations  being 
used  are  changed.  Observation  selections  that  span  relatively  large  volumes 
can  be  used  for  numerous  gridpoints,  reducing  computation  to  a  simple 
inner  product  of  two  column  vectors  for  all  but  the  first  grid  point.  This 
simplification  makes  the  volume  method,  where  observation  selections 
are  made  for  volumes  of  gridpoints  rather  than  individual  ones,  cost 
effective.  This  is  especially  true  where  observations  are  sparse.  On  a 
spherical  grid  the  converging  meridians  near  the  poles  create 
densely  spaced  gridpoints,  so  that  about  20%  of  the  total  gridpoints  are 
easily  assigned  to  the  two  polar  volumes. 


» 
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Substitution  of  equation  (72)  into  equation  (68)  gives  the  equation 
for  the  expected  analysis  error 


'G*)2=l-h[M-lh*. 


(76) 


The  expected  analysis  errors  are  useful  in  the  comparison  of  strategies 
for  observation  utilization,  but  are  not  needed  during  the  operational 
analysis. 

Even  though  much  effort  goes  into  removing  observations  contaminated 
for  one  reason  or  another  (see  Baker,  191  a-d),  some  remain  questionable 
and  require  further  checking  by  the  internal  quality  control  procedures. 
The  steps  to  the  internal  quality  control  are  the  following: 

1.  Questionable  observations  are  tagged.  An  observation  becomes 
questionable  when  it  either  marginally  passes  the  external  quality  control, 
or  it  departs  from  the  background  fields  by  an  amount  that  exceeds 
three  times  the  expected  error  variance. 

2.  Observations  that  depart  from  the  background  field  by  an  amount 
that  exceeds  five  times  the  expected  error  are  tagged  for  gross  rejection. 

3.  The  change  in  the  analysis  done  to  a  questionable  observation  is 
evaluated  by  comparing  the  analysis  at  the  location  of  the  observation 
with  and  without  the  observation. 

4.  If  the  change  due  to  the  observation  exceeds  expected  tolerances, 
then  it  is  rejected  by  the  analysis. 

5.  Rejected  observations  are  kept  in  the  analysis,  but  their  weight  is 
constrained  to  zero.  This  is  done  without  resolving  equation  (74). 

The  mathematics  used  to  determine  the  change  done  to  an  individual 
observation,  its  expected  change,  and  procedures  for  constraining  the 
weights  of  rejected  observations  to  zero  are  presented  below.  These 
equations  are  also  found  in  Rutherford  (1978)  and  Lorenc  (1981)  in 
more  abbreviated  form.  The  expected  change  of  an  individual  observation 
is  related  to  the  variance  between  the  analysis  and  that  observation, 

{rk-qk)2=(r?)~  2  +(<?*) .  (77) 


Substituting  equation  (55)  gives 

^2  =  WI(<I*<£)W*-  (7«) 

Since 


e°p,-7ti 


(79) 
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it  follows  that 


(qqr)=M, 


(80) 


and 

7*  T 

{rtlk)  =  wjk(q<7*)  =  wkmk  ,  (81) 

where  m*  is  the  ldh  column  of  M. 

The  average  squared  difference  between  analysis  and  observation  is 
now 


{rt-Qk)2  =  w[Mw*  -  2w[m*  +  1  +  (e*)2  .  (82) 

Minimizing  this  equation  through  optimization  of  the  weight  coefficients 
gives 


Mw*  +  w^M  -  2tn*  =  0  , 


(83) 


or 


w J  =  M-1m*  =  dt .  (84) 

The  vector  d*  is  zero  except  for  the  k!h  element,  which  is  one.  The 
minimization  of  equation  (82)  with  the  suspect  observation  contributing 
to  the  analysis  is  not  useful  and  produces  only  a  trivial  solution,  and 
therefore  the  minimization  must  be  constrained. 

Minimizing  the  averaged  squared  difference  between  the  analysis  and 
observation  subject  to  the  constraint  that  selected  observations  arc  given 
zero  weight  requires  that  equation  (82)  be  minimized  subject  to  the 
constraints 

dmWt=0Vme/,  (85) 

where  Vm  e  l  is  to  be  read  for  all  mfrom  the  set  that  makes  up  the  list 
of  rejected  observations ,  /. 

A  constrained  minimization  of  the  analysis  error  can  be  achieved 
with  variational  calculus,  or 

{rk-gk)2  =  -2wjmt  +  1  +  (e*)2-2£  ?£d^w*  ,  (86) 

Vm 


where  X*,  is  the  Lagrange  multiplier  used  to  insure  the  constraint,  equa¬ 
tion  (85),  is  exactly  satisfied.  Taking  the  first  variation  of  equation  (86) 
with  respect  to  wk  and  and  setting  the  result  to  zero,  gives  an 
equation  set  whose  solution  is  the  optimum  column  matrix,  wt,  that 
gives  zero  weighting  to  the  rejected  observations,  i.e.. 
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0=  2wtM-2mjk  +  2DX*  +  2DTw*  , 


(87) 


The  columns  of  D  are  made  of  dmVm  e  l.  To  solve  for  v/k  is  eliminated 
by  multiplying  both  sides  of  this  equation  by  M-1  and  then  Dr,  or 

X*  =  -  [DtM-1D]-1D7 d  *  .  (88) 

The  expected  change  caused  by  an  individual  observation  is  computed 
by  substitution  of  equations  (85)  and  (72)  into  equation  (86),  or 

{rk  - Qk)2  =  (G *) 2  +  1  ~  w*m*  •  (89) 

Observations  are  rejected  when  they  cause  a  change  that  exceeds  their 
expected  change  according  to 

{rk-qk)*>T*\{rk-  **}*  +  0.l].  (90) 

The  value  of  the  tolerance,  T,  is  used  to  tune  the  rejection,  and  the 
analysis,  rk ,  excludes  the  observation  at  k  and  others  already  rejected. 
This  method  of  evaluation  is  a  way  to  compare  an  individual  observation 
with  all  of  the  other  observations  and  the  first  guess.  It  even  accounts 
for  differences  in  observation  density,  where  an  individual  observation 
is  expected  to  impact  an  analysis  more  over  data  sparse  than  data-dense 
areas. 

Each  questionable  observation  is  tested  using  equation  (90).  Once  all 
the  observations  that  fail  the  test  are  identified,  the  solution  for  optimal 
weights  is  again  computed  through  minimization  of  the  equation  for  the 
ensemble  average  of  squared  analysis  error,  but  this  time  subject  to 
the  constraint  that  the  rejected  observations  have  zero  weight.  The  optimum 
solution  comes  from  minimization  of 

(e*)2  =  l-2w*h*  +  w*Mw*X  ,  (91) 

Vm 


with  respect  to  the  independent  variables  wt  and  X*.  This  gives 

0  =  -  2h*  +  2w|’M  +  2DJl*  +  2D7Wt .  (92) 

Multiplying  both  sides  of  this  equation  by  M  -1  and  D  eliminates  wt. 
thereby  giving  the  solution  for  the  Lagrange  multiplier, 

X*  =  -  (Dr  M  -JD)-1  IF  M-«h t .  (93) 

Substituting  this  equation  into  the  equation  for  weighting  coefficients, 
equation  (92),  gives  the  final  equation  for  the  optimum  weighting 
coefficients. 
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w*  =  M-»h  *  -  M-»D  (Dt  M-*D)",Dr  M“>h  *  . 


(94) 


The  final  analysis  with  rejected  observations  removed  becomes 

r*  =  h[(M-1-M-iD[Dr  M-»D]_,Dr  M-‘)q.  (95) 

Collecting  the  terms  that  are  independent  of  the  grid  point  location  into 
a  single  term  gives 

rk  =  cTqhk,  (96) 


where 

cq  =  [M"1  -  M-1D[Dr  M_1D]_1Dr  M'^q  .  (97) 

This  equation  represents  the  optimum  analysis  subject  to  the  constraint 
that  rejected  observations  have  zero  weight.  A  comparison  of  the 
coefficient  equation  for  the  optimum  solution,  (74),  with  the  equation 
for  optimum  solution  using  constraints,  (97),  reveals  that  the  two  solutions 
can  be  computed  as  simply  as  one,  because  the  second  solution  is  a 
function  of  the  first.  The  only  additional  work  required  for  the  second 
solution  is  the  inversion  of  [DrMD],  which  is  a  square  matrix  with 
order  equal  to  the  number  of  rejected  observations.  Typically,  the  solution 
for  300  observation  values  will  require  the  inversion  of  a  sixth  order 
matrix,  corresponding  to  6  rejected  values. 

The  theory  above  is  used  in  the  analysis  of  observations  using  the 
following  steps: 

1.  All  observation  values  are  converted  to  differences  between  observed 
and  predicted  values.  This  is  done  by  interpolating  the  gridded  prediction 
to  observation  location,  and  subtracting  the  predicted  value  from  the 
observed  value.  The  interpolation  is  done  using  a  cubic-spline  algorithm 
designed  to  take  advantage  of  the  vector  speeds  on  the  computer. 

2.  A  grid  volume  is  defined  and  the  observations  appropriate  for  this 
volume  are  selected,  up  to  a  limit  of  300  values. 

3.  The  normalization  coefficients  are  used  to  convert  the  observations 
into  nondimensional  space. 

4.  The  pi  edict’ on  error  covariance  models  are  used  to  compute  the 
covariances  between  the  observations,  and  the  results  are  stored  in  M. 

5.  The  covariance  matrix,  M,  is  decomposed  using  Cholesky 
decomposition.  Again,  this  is  done  using  an  algorithm  designed  to  take 
advantage  of  the  vector  speed  of  the  computer.  This  makes  it  possible 
to  apply  the  factor  M-1  by  multiplication  of  the  matrices  that  result 
from  the  decomposition. 

6.  The  changes  in  the  analysis  due  to  each  questionable  observation 
are  compared  to  their  expected  change  using  equation  (86),  and  those 
failing  to  meet  the  conditions  of  equation  (90)  are  rejected. 

7.  The  column  matrix  (equation  74)  is  defined  from  M_1  and  the  list 
of  rejected  observations. 
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8.  For  each  gridpoint,  k,  the  covariances  are  computed  between 
observation  locations  and  gridpoint  location,  and  stored  in  h*. 

9.  The  inner  product  between  ck  and  h*  is  computed  as  in  equa¬ 
tion  (96)  to  get  the  analysis  at  location  k. 

10.  The  previous  two  steps  are  repeated  through  all  points  on  the 
grid. 

11.  The  normalization  coefficients  are  used  to  convert  the  analysis  to 
dimensional  space. 

12.  Steps  2  through  1 1  are  repeated  until  all  grid  volumes  are  complete, 
then  the  grid  volumes  are  meshed  with  each  other  to  form  the 
completed  analysis  increment. 

13.  The  increments  are  interpolated  to  the  model  coordinates  and 
added  to  the  prediction  values.  This  updates  the  prediction  model,  thereby 
generating  the  analysis. 

The  bulk  of  the  computer  code  used  to  generate  an  analysis  from  the 
steps  above  is  in  the  data  base  interfaces,  the  observation  sorting,  selection, 
and  tagging.  It  is  difficult  to  validate  this  code  except  to  track  individual 
data  values  through  the  process,  which  was  done  in  a  variety  of  ways. 
The  theory,  likewise,  is  difficult  to  validate  because  of  the  lack  of 
analytic  examples  to  generate  exact  cases.  Some  of  the  validation  tests 
that  illustrate  features  of  the  analysis  are  given  in  the  next  section. 


6.0  Validation  of 
Analysis  System 


6.1  Analysis  Constraints 


In  the  development  of  prediction  models,  the  accuracy  of  the  numerical 
product  is  frequently  compared  against  a  known  result.  The  nature  of 
objective  analysis  methods  does  not  typically  require  that  this  be  done 
unless  one  wishes  to  insure  against  coding  or  design  errors  whose  effect 
is  too  small  to  be  detected  when  the  system  is  exposed  to  imperfect  real 
situations.  This  insurance  against  small  errors  is  valuable  because  it  is 
these  errors  that  go  undetected  for  years,  and  cause  sophisticated  analysis 
methods  to  perform  as  poorly  as  the  less  rigorous  simple  ones. 

A  complete  suite  of  tests  are  available  to  test  multivariate  optimum 
interpolation  against  theory.  The  ability  of  the  analysis  system  to  retain 
constraints,  the  impact  of  observational  correlated  error,  the  accuracy 
of  the  vertical  interpolation,  the  impact  of  the  poles  in  the  spherical 
grid,  the  computation  of  the  covariance  model  from  analytical  fields, 
and  a  nonlinear-least-squares  fitting  algorithm  have  been  tested.  These 
tests  are  also  valuable  in  their  illustration  of  the  strengths  of  the 
multivariate  optimum  interpolation  methodology.  A  few  of  the  more 
important  tests  are  given  here  to  demonstrate  the  features  built  into  the 
program. 

Two  constraints  can  be  imposed  through  the  proper  selection  of  the 
appropriate  parameters  when  the  covariances  are  computed. 

The  geostrophic  constraint  is  controlled  by  the  parameter  p,  which 
must  be  set  to  one  for  full  geostrophic  coupling.  Since  the  geostrophic 
constraint  is  also  nondivergent,  the  divergence  parameter,  v,  must  be 
set  to  zero  to  give  a  purely  rotational  wind  field. 

The  geostrophic  constraint  was  tested  using  a  line  of  gridpoints  10  km 
apart.  Observations  were  located  at  the  endpoints  of  this  line.  The  analyzed 
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winds  were  then  compared  with  the  wind  computed  from  the  centered 
finite  difference  equivalence  of  u  =  -/  Vz4>.  The  results  of  this  experiment 
are  shown  in  Figure  1  for  the  right-hand  side  of  the  grid,  since  the 
solutions  are  symmetrical  about  the  center.  As  shown  in  Figure  1, 
the  computed  and  analyzed  winds  were  identical  to  each  other,  illustrating 
that  the  geostrophic  constraint  is  precise.  It  should  be  remembered  that  the 
analysis  constraints  are  only  imposed  on  the  increments  between 
the  analysis  and  the  prediction,  and  not  the  total  field.  Applied  only 
to  the  increments,  it  can  be  argued  that  the  nonlinear  effects  are  negligible 
and  that  the  geostrophic  coupling  is  satisfactory.  Williamson  and  Daley 
(1983)  discovered  situations  where  the  nonlinear  component  is  important, 
and  they  proposed  a  unified  method  to  overcome  the  deficiency.  Their 
method  adjusts  nonlinear  components  of  the  flow  through  iteration 
between  the  analysis  and  initialization;  a  technique  that  should  be  tested 
with  the  Navy  system. 

Even  though  the  parameters  may  be  set  for  full  coupling  between 
winds  and  heights,  the  resulting  analysis  may  lack  this  coupling.  This 
is  caused  by  two  factors: 

1.  The  grid  used  to  compute  geostrophic  winds  may  lack  the  resolution 
required  to  pick  up  the  gradients  that  exist  on  the  analysis  surface,  and 

2.  The  observations  used  to  analyze  one  area  may  not  be  used  to 
analyze  an  adjacent  area. 

This  latter  factor  is  illustrated  in  Figure  2  where  the  observations 
used  over  the  center  half  of  the  grid  were  different  from  the  ones  used 
over  the  outer  quarters;  the  center  half  of  the  grid  included  a 
wind  observation  equal  to  the  geostrophic  wind  computed  from  the  two 
height  observations  located  at  the  endpoints.  Note  the  discontinuity  in 
Figure  2  at  the  boundary  between  the  analysis  done  with  all  observations 
and  that  done  only  with  heights.  The  analyzed  winds  reflect  the  geostrophic 
winds  of  the  analysis  surface,  whereas  the  computed  winds  reflect 
the  differences  between  the  two  surfaces.  This  effect  may  not  be 
obvious  where  the  observations  are  selected  for  gridpoints  (see 


Figure  1.  The  analyzed  pressure  height  and  computed  wind  from  height  using 
the  geostrophic  relationship  plotted  with  the  true  solution  for  height.  The 
analysis  was  made  from  two  height  observations  located  at  + 400  km  and 
-400  km.  and  a  single  wind  observation  at  the  origin. 


The  Development  of  the  Navy’s  Multivariate  Optimum  Interpolation  Analysis  System 


19 


0  100  200  300  400 


DISTANCE  (km) 

Figure  2.  Same  as  Figure  1  except  the  left  half  of  the  grid  was  analyzed  using 
a  different  set  of  observations  from  the  right  side.  The  discontinuity  in  the 
height  surface  in  the  center  is  where  the  switch  from  one  data  set  to  the  other 
was  made. 

Barker,  1992),  but  where  they  are  picked  for  grid  volumes  the  analysis 
surface  in  one  volume  may  be  significantly  different  from  the  surface 
in  an  adjacent  volume.  Consequently,  the  volumes  are  designed  so  that 
they  overlap  and  the  discontinuity  can  be  made  smooth  through  weighted 
averaging  of  the  overlapping  surfaces.  Unfortunately,  the  constraints 
are  not  retained  through  this  averaging  procedure. 

The  constraint  governing  the  divergence  is  illustrated  using  a  single 
wind  observation  at  the  center  of  the  analysis  grid.  The  results  for  fully 
rotational  (v  =  0),  fully  divergent  (v  =  1),  and  a  mixture  of  divergent 
and  rotational  (v  =  .5)  are  shown  in  Figure  3.  It  is  of  interest  to  note 
that  the  analysis  comprised  of  both  divergent  and  rotational  components 
produced  corrections  along  the  direction  of  the  observed  wind  only. 

The  impact  of  the  coupling  of  a  multivariate  analysis  was  tested  in 
a  global  data  assimilation  experiment.  The  system  used  was  NOGAPS 
version  2  (the  current  version  is  3.2).  The  fit  of  the  analysis  first-guess, 
which  is  the  predicted  values  from  the  previous  analysis,  was  used  to 
measure  the  success  of  the  two  systems.  In  both  experiments,  the  analysis 
equally  fit  the  observations,  but  first-guess  errors  were  about  10%  smaller 
in  the  coupled  system.  These  results  are  shown  in  Figure  4. 

6.2  Expected  Analysis  Error  The  expected  analysis  error  is  the  quantity  that  is  minimized  in  the 

optimization  of  the  weighting  coefficients  used  in  the  analysis.  This 
quantity  is  useful  in  the  illustration  of  the  power  of  the  optimum 
interpolation  methodology,  and  it  is  readily  computed  from  equation  (76). 
The  capability  of  the  analysis  to  effectively  utilize  various  types  of 
observations  was  illustrated  for  the  following  situations:  correlated 
observational  error  compared  to  random  error;  height  and  wind  analysis 
accuracy  compared  for  computations  from  height  observations  alone, 
wind  observations  alone,  or  a  combination  of  heights  and  winds.  These 
experiments  are  similar  to  the  ones  illustrated  by  Bengtsson  (1976). 

Correlated  observation  error  may  occur  when  a  single  platform,  such 
as  a  satellite,  is  used  to  produce  many  observations.  Should  the  errors 
be  caused  by  the  instrument  or  the  algorithm  that  converts  the  sensor 
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Figure  3.  Wind  analysis  using  a  single  wind  value.  The  divergence  parameters  was  set  for  (a)  fully  rotational, 
(b)  fully  divergent,  and  (c)  a  combination  of  both  divergent  and  rotational. 


signal  to  an  environmental  measurement,  then  all  of  the  observations 
taken  by  the  instrument  will  have  similar  errors.  The  ability  of  the 
optimum  interpolation  to  extract  gradient  information  from  a  system 
that  contains  correlated  errors  is  shown  in  an  experiment  in  which  a 
series  of  analyses  were  performed  from  four  observations.  The  analyses 
were  identical  except  for  the  location  of  the  observations,  which  were 
positioned  closer  together  in  each  succeeding  analysis.  The  estimated 
analysis  error  in  the  center  of  the  observation  grid  was  computed.  The 
results  of  a  series  of  analyses  were  plotted  to  show  how  the  estimated 
error  changed  with  observation  spacing.  Figure  5  shows  the  results  of 
two  series  of  analyses,  one  with  random  and  the  other  with  correlated 
observation  error.  The  estimated  analysis  error  in  the  wind  is  smaller 
when  the  height  observation  errors  were  correlated  than  when  they  were 
random,  illustrating  the  capability  of  the  analysis  to  extract  gradient 
information  from  otherwise  poor  data.  The  estimated  error  in  the  height 
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Figure  4.  Root-mean-square  differences  between  radiosonde  heights  and 
resulting  analysis,  and  between  radiosonde  heights  and  the  first  guess  for 
multivariate  analysis  and  univariate  analysis. 


Figure  5.  Estimated  root-mean-square  error  in  analysis  of  (a)  height  and  (b)  wind, 
at  the  center  of  a  four-point  observation  grid  vs.  radial  distance  for  four 
height  observations  comparing  random  observation  errors  with  correlated 
observation  errors. 


analysis  was  least  when  the  observational  error  was  random,  illustrating 
that  the  optimum  interpolation  is  able  to  remove  random  error  by  averaging 
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the  data  values,  and  this  capability  is  improved  with  observation  density. 
From  the  viewpoint  of  utilizing  satellite  data,  increased  observation 
density  improves  the  height  analysis  when  the  errors  are  random,  and 
improves  the  wind  analysis  when  they  are  correlated. 

Wind  observations  alone  may  not  produce  an  accurate  height  analysis 
because  winds  can  only  contribute  to  the  gradients.  This  was  shown 
using  the  method  described  above  except  that  the  four  observations 
were  wind.  A  second  series  of  analyses  were  done,  but  with  a  height 
observation  added  to  one  of  the  locations.  A  comparison  of  the  expected 
errors  in  heights  for  the  two  observation  configurations.  Figure  6,  shows 
that  although  the  errors  were  large  when  winds  alone  were  used,  adding 
a  single  height  observation  dramatically  decreased  the  height  errors  by 
as  much  as  50%. 

The  advantage  of  the  multivariate  analysis  was  also  illustrated  by  the 
simulation  of  an  aircraft  making  observations  through  the  southeast 
sector  of  a  low  pressure  system.  Three  different  aircraft  sensor 
configurations  were  simulated:  wind  sensor  alone,  height  sensor  alone, 
and  a  combined  wind  and  height  sensor.  The  simulated  observations  are 
shown  in  Figure  7.  The  true  fields  are  shown  with  the  analyzed  fields 
in  Figures  8  and  9.  The  wind  sensor  alone  simulation  gave  a  reasonable 
location  for  the  storm,  but  it  was  too  weak.  The  height  sensor  alone 
simulation  produced  a  more  intense  storm  in  the  analysis,  but  it  too  was 
weak,  and  its  location  was  too  far  southeast.  The  combined  sensor 
produced  the  best  results,  although  it,  too,  lacked  intensity.  The  fit  of 
the  analysis  to  the  simulated  observations  and  to  the  true  field  is  shown 
in  Figure  10.  When  verified  against  the  observations,  the  analysis 
agreed  with  height  observations  when  using  the  height-only  sensor,  and 
it  agreed  with  wind  observations  when  using  the  wind-only  sensor. 
Compared  to  the  true  field,  however,  the  wind-only  sensor  produced 
least  accurate  results,  followed  by  the  height-only  sensor.  The  combined 
wind  and  height  sensor  produced  the  most  accurate  analysis  reducing 
the  height  analysis  from  the  wind-only  sensor  by  47%  and  reducing  the 
height  analysis  error  by  25%. 


Figure  6.  Estimated  root-mean-square  error  in  analysis  of  height  at  the  center 
of  a  four-point  observation  grid  vs.  radial  distance  far  four  height  observations, 
four  wind  observations,  and  four  wind  observations  plus  one  height  observation. 
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Figure  9.  Source  as  Figure  8  except  for  winds. 
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Figure  10.  Root-mean-square  error  as  determined  from  the  simulated 
observations  and  the  true  fields.  The  errors  are  for  three  simulations:  wind 
sensor  alone,  height  sensor  alone,  and  combined  wind  and  height  sensor.  The 
left  half  of  the  graph  is  for  verification  against  observations  and  the  right  half 
is  for  verification  against  the  true  field. 

6.3  Vertical  Interpolation  The  vertical  consistency  of  the  analysis  is  best  maintained  with 

observations  of  wind  and  temperature  profiles  that  extend  through  the 
analysis  layer.  To  use  single  level  observations  such  as  aircraft 
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measurements  and  cloud-track  winds  requires  determination  of  appropriate 
vertical  covariance  function  of  the  prediction  error  variance.  The  example 
illustrated  shows  how  satellite  temperature  profiles  may  be  combined 
with  single-level  observations  so  that  they  support  each  other. 

Satellite  derived  soundings  do  not  contain  the  reference  level  needed 
to  convert  inferred  temperatures  to  geopotential  height,  so  the  observations 
are  given  in  thicknesses  between  the  standard  pressure  surfaces.  A  common 
method  to  analyze  satellite  soundings  has  been  to  analyze  a  reference 
level,  and  then  convert  the  satellite  derived  thicknesses  to  geopotential 
heights.  Using  MVOI,  satellite  derived  thicknesses  can  be  input  directly 
and  blended  with  available  reference  data. 

To  test  this  feature,  two  analyses  were  computed.  One  analysis  was 
given  a  single  thickness  observation,  and  the  other  was  given  a  thickness 
and  a  pressure  height  observation.  To  make  the  results  easy  to 
interpret,  the  observations  were  assumed  to  contain  no  error.  After 
subtracting  the  predicted  estimate,  the  thickness  increment  of 
the  500-  to  400-mb  layer  was  -200  m,  and  the  height  increment  of  the 
300-mb  pressure  surface  was  0  m.  In  this  test,  the  horizontal  locations 
of  the  analysis  gridpoints  and  the  observations  were  horizontally 
collocated.  The  vertical  covariance  model  relative  to  the  500-mb  surface 
is  shown  in  Figure  1 1 .  The  two  analyses  and  vertical  locations  for  both 
analyses  are  shown  in  Figure  13. 

The  analysis  increment  from  the  single  thickness  observation  produced 
a  500-  to  400-mb  layer  thickness  correction  of  -200  m.  Note  that  the 
largest  correction  occurred  at  300  mb.  In  the  second  analysis,  the  300  mb 
height  observation  was  added.  The  analysis  increment  for  the  500-  to 
400-mb  thickness  was  again  -200  m,  but  this  time  the  analysis  increment 
at  300  mb  was  0  m,  corresponding  to  the  value  of  the  added  observation. 
In  other  words,  with  zero  observation  error,  the  analysis  drew  to 
the  observations,  in  effect  using  the  height  observation  as  a  reference  level. 
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Figure  12.  Analysis  increment  computed  from  a  single  thickness 
observation  shown  with  an  analysis  increment  computed  from 
the  single  thickness  observation  and  a  height  observation.  Both 
observations  were  assumed  to  contain  no  error. 


150-100  - 
200-150  - 
__  250-200  - 

t 

ff  300-250  - 

uj  400-300  |- 
cc 

CO 

jS  500-400  (- 
ct 

700-500 
850-700 
1000-850  h 


THK(OE) 

ESSSS  THK  +  HT  (OE) 


m\\\N 


x 


X 


X 


X 


X 


-250 


-200 


-150 


50 


-100  -50  0 

THICKNESS  (m) 

Figure  13.  Plots  of  the  thickness  of  the  analyses  shown  in  Figure  12 
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In  another  pair  of  tests,  the  observations  were  assumed  to  contain 
random  error  variance  at  80%  of  the  predicted  estimate.  The  analysis 
increments,  shown  in  Figure  14,  were  about  45%  smaller  than  the  results 
above,  and  closer  to  the  increments  that  would  be  computed  in  an  actual 
situation.  The  analysis  did  not  draw  exactly  to  the  data,  but  instead  was 
making  a  compromise  between  the  predicted  estimate  and  the  observations. 
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Figure  14.  Same  as  Figure  12  except  for  observations  with  errors 
equal  to  80%  of  the  prediction  mean  error. 


6.4  Analysis  Near  Poles 
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Figure  15.  Two  wind  observations  representing  the  same  flow  shown  plotted  on  | 

a  spherical  grid  and  the  resulting  analysis.  J 

Winds  lose  their  definition  of  direction  on  a  spherical  grid  near  the 
poles.  To  eliminate  this  problem,  the  analysis  is  done  with  the  wind 
components  projected  to  a  polarstereographic  projection,  and  then  the 
product  converted  back  to  a  spherical  grid.  The  success  of  this  procedure  _ 

was  illustrated  by  placing  two  wind  observations  near  the  north  pole. 

Although  they  represent  die  same  flow,  projected  onto  a  spherical  grid, 
they  appear  to  flow  in  opposite  directions.  The  analysis  shown  in  both 
spherical  projection.  Figure  15,  and  polarstereographic  projection. 

Figure  16,  agree  with  the  input  observation.  Tests  of  the  geostrophic 
constraint  in  the  polar  volume  across  the  pole  gave  results  as  accurate  M 

as  other  areas.  ] 


7.0  Data  Selection 
Procedures 


The  volume  method  of  MVOI  pertains  to  the  procedures  used  to  J 
select  observations  for  analysis  at  a  particular  gridpoint.  Historically, 
observations  were  selected  for  individual  gridpoints,  but  limited  computer 
time  and  the  belief  that  five  observations  containing  height  and  wind 
would  be  sufficient  to  acquire  the  necessary  accuracy  kept  the  typically 
selected  number  of  observations  small.  As  the  computer  power  increased, 
most  centers  increased  the  number  of  observations  used  to  analyze  a  m 
single  gridpoint,  and  the  increased  numbers  surprisingly  improved  the  1 

analysis.  I 
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X  (1000  km) 

Figure  16.  Same  as  Figure  15  except  plotted  on  a  polarstereographic  grid. 


Lorenc’s  (1981)  solution  to  using  large  numbers  of  observation  per 
gridpoint  is  to  make  the  selection  for  volumes  of  gridpoints.  This  lessens 
the  computational  work  load  since  equation  (73)  needs  to  be  solved 
only  once  for  volumes  containing  from  1 ,000  to  40,000  points. 

The  results  in  section  6  show  how  a  discontinuity  can  occur  in  an 
analysis.  Since  there  is  limit  to  the  density  of  gridpoints  one  may  choose, 
the  solution  to  the  analysis  equations  for  a  particular  set  of  observations 
and  error  functions  is  actually  a  smoothly  varying  surface.  When  the 
observations,  functions,  or  both  are  changed  from  one  area  to  the  next, 
the  analysis  surfaces  may  take  on  the  appearance  of  Figure  2,  in  contrast 
to  Figure  1,  where  the  error  functions  and  observations  are  the  same 
throughout.  Dynamic  consistency  of  the  analysis  is  best  obtained  using 
the  volume  method,  because  it  tends  to  fit  the  observations  as  well 
as  the  gridpoint  method,  plus  it  is  smoother.  Obviously,  the  gridpoint 
method  converges  to  the  volume  method  as  the  observation  selection 
and  error  functions  become  similar. 

With  these  characteristics  in  mind,  the  data  selection  scheme  was 
designed  to  achieve  several  goals: 

1.  Achieve  vertical  and  horizontal  smoothness, 

2.  Maximize  analysis  volumes  over  the  polar  regions  where  gridpoint 
density  is  largest, 

3.  Maximize  the  influence  of  radiosondes  below  100  mb, 

4.  Maximize  the  influence  of  satellite  soundings  above  100  mb. 
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8.0  Internal  Quality 
Control 


5.  Maximize  the  distribution  of  observations  used  in  the  analysis, 

6.  Minimize  computations, 

7.  Utilize  as  many  of  the  observations  as  feasible. 

The  characteristics  of  the  resultant  configuration  of  observation  selection 
are  the  following: 

1.  Analyze  over  the  largest  possible  polar  volume  that  contains  300 
data  values,  which  produced  a  polar  volume  that  included  all  gridpoints 
poleward  of  ±70°. 

2.  Spatially  vary  observation  selection  volumes  to  utilize  most 
observations  within  a  region.  This  tactic  produced  volumes  over  continents 
that  were  half  as  large  as  the  volumes  over  the  oceans  of  the  Southern 
Hemisphere. 

3.  Profile  observations  such  as  radiosondes,  satellite  soundings,  and 
PIBALS  were  either  wholly  selected  or  completely  excluded,  according 
to  a  priority  list  for  observation  selections  instead  of  arbitrarily  selecting 
values  from  all  profiles.  This  was  done  to  insure  vertical  consistency 
and  smoothness. 

4.  The  priority  list  of  observation  selection  starts  with  radiosondes 
and  surface  observations,  and  then  works  through  satellite  derived 
thicknesses,  PIBAL  winds,  AIREPs,  and  finally  satellite  winds. 

5.  To  get  the  selected  observations  to  be  evenly  distributed  over  the 
volume,  the  observations  were  first  sorted  by  latitude.  The  selection 
strategy  involves  choosing  every  fourth  observation  or  so  starting  with 
the  first  observation  the  first  time  through,  the  second  observation  the 
second  time  through,  etc. 

The  strategy  for  removing  observations  with  errors  is  not  easy,  yet  the 
quality  of  a  forecast  may  fully  depend  on  a  correct  decision  in  data 
quality  control.  The  difficulty  can  be  illustrated  using  the  simulated 
aircraft  scenario  of  section  6.2  as  plotted  in  Figure  17.  In  this  illustration, 


Figure  17.  Same  as  Figure  7a,  but  for  one  observation  in 
error.  The  observations  rejected  by  the  analysis  are  marked 
by  an  x. 
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Figure  18.  The  analysis  of  height  using  the  observations  shown  in  Figure  17  (a)  when  only  the  erroneous  observation 
(reported  as  +50  m)  was  marked  suspicious  and  (b)  when  the  erroneous  observation  and  its  two  closest  neighbors 
were  marked  suspicious. 


the  aircraft  had  a  height-only  sensor,  but  one  of  the  observations  came 
in  as  +50  m  instead  of  -219  m.  Two  strategies  were  used  to  remove  the 
error.  In  one  instance,  only  the  erroneous  observation  was  marked 
suspicious,  and  in  the  other  the  erroneous  observation  and  the  observations 
on  either  side  were  marked  suspicious.  The  solutions  for  the  two  cases 
were  completely  different  (Fig.  18).  When  only  the  observation  in  error 
was  marked  suspicious,  it  was  correctly  eliminated,  but  when  the  erroneous 
observation  and  its  two  nearest  neighbors  were  marked  suspicious,  a 
key  observation  was  eliminated.  The  eliminated  observations  are  shown 
in  Figure  17  marked  by  an  "x"  for  the  test  where  more  than  the  erroneous 
observations  were  marked  suspicious. 

Clearly,  the  problem  illustrated  here  is  the  result  of  a  sparsely  observed 
region.  But  it  is  this  sparsity  that  makes  the  quality  control  solution  so 
critical.  It  is  recognized  that  as  much  preprocessing  as  possible  be  utilized, 
and  that  the  internal  quality  control  be  given  as  much  assistance  as 
possible  to  make  the  correct  decisions. 

There  are  many  other  things  that  can  be  done  to  insure  that  only  the 
highest  quality  observations  get  into  the  analysis,  and  the  process  is 
continually  evolving. 

For  example,  work  is  ongoing  to  utilize  the  performance  history  of 
the  observations  and  to  eliminate  consistently  bad  stations  (see  Baker 
et  al.,  1991). 


9.0  Conclusions  Although  there  are  newer  and  better  methods  being  developed,  there 

are  still  things  that  can  be  done  to  make  the  Navy  MVOI  give  better 
results.  Suggestions  based  on  the  experience  gained  since  the  system 
became  operational  in  1988  are  presented  below. 

The  MVOI  couples  the  wind  and  geopotential  linearly  using  the 
geostrophic  relationship.  It  has  been  shown  by  Williamson  and  Daley 
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